In-Place Length-Restricted Prefix Coding

Authors

  • Ruy Luiz Milidiú
  • Artur Alves Pessoa
  • Eduardo Sany Laber
Abstract

Huffman codes, combined with word-based models, are considered efficient compression schemes for full-text retrieval systems. The decoding rate for these schemes can be substantially improved if the maximum codeword length is not greater than the machine word size L. However, if the vocabulary is large, simple methods for generating optimal length-restricted codes are either too slow or require a significantly large amount of memory. In this paper we present a simple and fast in-place implementation of the BRCI (Build, Remove, Condense and Insert) algorithm, an approximate method for length-restricted coding. It overwrites a sorted input list of n weights with the corresponding codeword lengths in O(n) time. In addition, the worst-case compression loss introduced by BRCI codes with respect to unrestricted Huffman codes is proved to be negligible for all practical values of both L and n.
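
To make the length restriction concrete, the following minimal Python sketch computes ordinary (unrestricted) Huffman codeword lengths and checks them against a word size L. It is not the BRCI algorithm, which the abstract does not specify in detail, and it is neither in-place nor O(n). The function names `huffman_lengths`, `kraft_sum` and `fits_in_word` are hypothetical, chosen only for this illustration.

```python
import heapq
from fractions import Fraction


def huffman_lengths(weights):
    """Return unrestricted Huffman codeword lengths, one per input weight.

    Plain heap-based construction, O(n log n) or worse; NOT the in-place
    O(n) BRCI method described in the abstract.
    """
    n = len(weights)
    if n == 0:
        return []
    if n == 1:
        return [1]
    # Heap entries: (subtree weight, unique tiebreak, leaf indices in subtree).
    heap = [(w, i, [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * n
    tiebreak = n
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf in them one level deeper.
        for i in a + b:
            lengths[i] += 1
        heapq.heappush(heap, (w1 + w2, tiebreak, a + b))
        tiebreak += 1
    return lengths


def kraft_sum(lengths):
    """Exact Kraft sum; a binary prefix code with these lengths exists iff it is <= 1."""
    return sum(Fraction(1, 2 ** length) for length in lengths)


def fits_in_word(lengths, L):
    """True if no codeword is longer than the assumed machine word size L."""
    return max(lengths) <= L


# Fibonacci-like weights produce a maximally skewed Huffman tree.
weights = [1, 1, 2, 3, 5, 8, 13, 21]
lengths = huffman_lengths(weights)
print(lengths, kraft_sum(lengths), fits_in_word(lengths, 4))
```

With these skewed weights the longest unrestricted codeword exceeds a limit of L = 4, which is the situation a length-restricted method is meant to repair while keeping the Kraft sum at most 1.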

Similar resources

Twenty (or so) Questions: $D$-ary Length-Bounded Prefix Coding

Efficient optimal prefix coding has long been accomplished via the Huffman algorithm. However, there is still room for improvement and exploration regarding variants of the Huffman problem. Length-limited Huffman coding, useful for many practical applications, is one such variant, for which codes are restricted to the set of codes in which none of the n codewords is longer than a given length, ...

Optimal Maximal Prefix Coding and Huffman Coding

Huffman coding has been widely used in data, image, and video compression. Novel maximal prefix coding different from the Huffman coding is introduced. Relationships between the Huffman coding and optimal maximal prefix coding are discussed. We show that all Huffman coding schemes are maximal prefix coding schemes and have the shortest average code word length among maximal prefix coding scheme...

Redundancy-Related Bounds on Generalized Huffman Codes

This paper presents new lower and upper bounds for the compression rate of optimal binary prefix codes on memoryless sources according to various nonlinear codeword length objectives. Like the most well-known redundancy bounds for minimum (arithmetic) average redundancy coding — Huffman coding — these are in terms of a form of entropy and/or the probability of the most probable input symbol. Th...

Using an innovative coding algorithm for data encryption

This paper discusses the problem of using data compression for encryption. We first propose an algorithm for breaking a prefix-coded file by enumeration. Based on the algorithm, we respectively analyze the complexity of breaking Huffman codes and Shannon-Fano-Elias codes under the assumption that the cryptanalyst knows the code construction rule and the probability mass function of the source. ...

Lossless Coding with Generalised Criteria

This paper presents prefix codes which minimize various criteria constructed as a convex combination of maximum codeword length and average codeword length or maximum redundancy and average redundancy, including a convex combination of the average of an exponential function of the codeword length and the average redundancy. This framework encompasses as a special case several criteria previousl...

Publication date: 1998